Admissible Time Series Motif Discovery with Missing Data

Authors

  • Yan Zhu
  • Abdullah Mueen
  • Eamonn J. Keogh
Abstract

The discovery of time series motifs has emerged as one of the most useful primitives in time series data mining. Researchers have shown its utility for exploratory data mining, summarization, visualization, segmentation, classification, clustering, and rule discovery. Although there has been more than a decade of extensive research, there is still no technique to allow the discovery of time series motifs in the presence of missing data, despite the well-documented ubiquity of missing data in scientific, industrial, and medical datasets. In this work, we introduce a technique for motif discovery in the presence of missing data. We formally prove that our method is admissible, producing no false negatives. We also show that our method can “piggy-back” off the fastest known motif discovery method with a small constant factor time/space overhead. We demonstrate our approach on diverse datasets with varying amounts of missing data.

Similar resources

Missing data imputation in multivariable time series data

Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Multivariate time series missing data imputation is a challenging topic and needs to be carefully considered before learning or predicting time series. Frequent researches have been done on the use of diffe...

Pattern Discovery for Locating Motifs in Multivariate, Real-valued Time-series Data

The problem of locating motifs in multivariate, real-valued time series data concerns the discovery of sets of recurring patterns embedded in the time series. Each set is composed of several nonoverlapping subsequences and constitutes a motif because all of the subsequences are similar. This task is a natural extension of univariate motif discovery in both the symbolic and real-valued domains a...

Time Series Motif Discovery and Anomaly Detection Based on Subseries Join

Time series are composed of sequences of data items measured at typically uniform intervals. Time series arise frequently in many scientific and engineering applications, including finance, medicine, digital audio, and motion capture. Time series motifs are repeated similar subseries in one or multiple time series data. Time series anomalies are unusual subseries in one or multiple time series ...

Constrained Motif Discovery

The goal of motif discovery algorithms is to efficiently find unknown recurring patterns in time series. Most available algorithms cannot utilize domain knowledge in any way which results in quadratic or at least sub-quadratic time and space complexity. For large time series datasets for which domain knowledge can be available this is a severe limitation. In this paper we define the Constrained...

Motif and Anomaly Discovery of Time Series Based on Subseries Join

Time series motifs are repeated similar subseries in one or multiple time series data. Time series anomalies are unusual subseries in one or multiple time series data. Finding motifs and anomalies in time series data are closely related problems and are useful in many domains, including medicine, motion capture, meteorology, and finance. This work presents a novel approach for both the motif di...

Journal title:
  • CoRR

Volume abs/1802.05472  Issue 

Pages  -

Publication date 2018